Analyzing the Effect of Different Programming Models Upon Performance and Memory Usage on Cray XT5 Platforms
Authors
Abstract
Harnessing the power of multicore platforms is challenging due to the additional levels of parallelism present. In this paper, we examine the effect of the choice of programming model upon performance and overall memory usage on the Cray XT5. We use detailed time breakdowns to measure the contributions to the total runtime from computation, communication, and OpenMP regions of the applications, gaining insights into the reasons behind any performance differences observed. We also examine the performance differences between two different Cray XT5 machines, which have quad-core and hex-core processors.
Related resources
An application-level parallel I/O library for Earth system models
We describe the design and implementation of an application-level parallel I/O (PIO) library for the reading and writing of distributed arrays to several common scientific data formats. PIO provides the flexibility to control the number of I/O tasks through data rearrangement to an I/O friendly decomposition. This flexibility enables reductions in per task memory usage and improvements in disk ...
Adjacency-based data reordering algorithm for acceleration of finite element computations
Effective use of the processor memory hierarchy is an important issue in high performance computing. In this work, a part level mesh topological traversal algorithm is used to define a reordering of both mesh vertices and regions that increases the spatial locality of data and improves overall cache utilization during on processor finite element calculations. Examples based on adaptively create...
Titan: Early experience with the Cray XK6 at Oak Ridge National Laboratory
In 2011, Oak Ridge National Laboratory began an upgrade to Jaguar to convert it from a Cray XT5 to a Cray XK6 system named Titan. This is being accomplished in two phases. The first phase, completed in early 2012, replaced all of the XT5 compute blades with XK6 compute blades, and replaced the SeaStar interconnect with Cray’s new Gemini network. Each compute node is configured with an AMD Opter...
Performance analysis of pure MPI versus MPI+OpenMP for Jacobi Iteration and a 3D FFT on the Cray XT5
Today many high performance computers are collections of shared memory compute nodes with each compute node having one or more multi-core processors. When writing parallel programs for these machines, one can use pure MPI or various hybrid approaches using MPI and OpenMP. Since OpenMP threads are lighter weight than MPI processes, one would expect that hybrid approaches will achieve better perf...
Communication Characteristics and Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-core SMP Nodes
Hybrid MPI/OpenMP and pure MPI on clusters of multicore SMP nodes involve several mismatch problems between the parallel programming models and the hardware architectures. Measurements of communication characteristics between cores on the same socket, on the same SMP node, and between SMP nodes on several platforms (including Cray XT4 and XT5) show that machine topology has a significant impact...
Publication date: 2010